Questionnaire & Opinion Survey
- North America > United States > Virginia > Fairfax County (0.05)
- North America > United States > Minnesota (0.05)
- North America > United States > Kansas (0.05)
- Media > News (1.00)
- Government > Regional Government > North America Government > United States Government (1.00)
Adaptive Budget Allocation in LLM-Augmented Surveys
Zikun Ye, Jiameng Lyu, Rui Tao
Large language models (LLMs) can generate survey responses at low cost, but their reliability varies substantially across questions and is unknown before data collection. Deploying LLMs in surveys still requires costly human responses for verification and correction. How should a limited human-labeling budget be allocated across questions in real time? We propose an adaptive allocation algorithm that learns which questions are hardest for the LLM while simultaneously collecting human responses. Each human label serves a dual role: it improves the estimate for that question and reveals how well the LLM predicts human responses on it. The algorithm directs more budget to questions where the LLM is least reliable, without requiring any prior knowledge of question-level LLM accuracy. We prove that the allocation gap relative to the best possible allocation vanishes as the budget grows, and validate the approach on both synthetic data and a real survey dataset with 68 questions and over 2000 respondents. On real survey data, the standard practice of allocating human labels uniformly across questions wastes 10–12% of the budget relative to the optimal; our algorithm reduces this waste to 2–6%, and the advantage grows as questions become more heterogeneous in LLM prediction quality. The algorithm achieves the same estimation quality as traditional uniform sampling with fewer human samples, requires no pilot study, and is backed by formal performance guarantees validated on real survey data. More broadly, the framework applies whenever scarce human oversight must be allocated across tasks where LLM reliability is unknown.
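The abstract does not spell out the allocation rule, so the following is only a rough illustration of the general idea, not the authors' algorithm. It assumes each question's estimate is corrected by the mean residual (human response minus LLM prediction), so a question's error scales with its residual variance divided by its label count. Under that assumption the ideal label count is proportional to the residual standard deviation (a Neyman-style allocation), and the sketch tracks it with plug-in estimates. The names `adaptive_allocate`, `draw_residual`, and `warmup` are hypothetical.

```python
import random
import statistics


def adaptive_allocate(draw_residual, num_questions, budget, warmup=2):
    """Sequentially assign `budget` human labels across questions.

    `draw_residual(k)` is a hypothetical callback that collects one human
    label for question k and returns (human response - LLM prediction).
    Returns the number of labels spent on each question.
    """
    if warmup < 2:
        raise ValueError("warmup must be at least 2 to estimate a std dev")
    if budget < warmup * num_questions:
        raise ValueError("budget too small for the warm-up phase")

    # Warm-up: a few labels per question so every std dev is defined.
    residuals = [[draw_residual(k) for _ in range(warmup)]
                 for k in range(num_questions)]

    for _ in range(budget - warmup * num_questions):
        # Plug-in Neyman rule (an assumption of this sketch): label the
        # question whose estimated std dev per label so far is largest.
        scores = [statistics.stdev(r) / len(r) for r in residuals]
        k = max(range(num_questions), key=scores.__getitem__)
        residuals[k].append(draw_residual(k))

    return [len(r) for r in residuals]


# Synthetic example: the LLM is accurate on question 0 and poor on question 2.
true_sd = [0.1, 0.5, 2.0]
rng = random.Random(0)
counts = adaptive_allocate(lambda k: rng.gauss(0.0, true_sd[k]), 3, 300)
print(counts)
```

In this synthetic setup most of the budget should drift toward the question with the noisiest residuals, which is the qualitative behavior the abstract describes. A uniform split would instead give each question the same count regardless of LLM quality.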
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > Middle East > Jordan (0.04)
- Asia > China > Shanghai > Shanghai (0.04)
- Questionnaire & Opinion Survey (1.00)
- Research Report (0.81)
Where Do You Think You're Going?: Inferring Beliefs about Dynamics from Behavior
Inferring intent from observed behavior has been studied extensively within the frameworks of Bayesian inverse planning and inverse reinforcement learning. These methods infer a goal or reward function that best explains the actions of the observed agent, typically a human demonstrator. Another agent can use this inferred intent to predict, imitate, or assist the human user. However, a central assumption in inverse reinforcement learning is that the demonstrator is close to optimal. While models of suboptimal behavior exist, they typically assume that suboptimal actions are the result of some type of random noise or a known cognitive bias, like temporal inconsistency. In this paper, we take an alternative approach, and model suboptimal behavior as the result of internal model misspecification: the reason that user actions might deviate from near-optimal actions is that the user has an incorrect set of beliefs about the rules -- the dynamics -- governing how actions affect the environment. Our insight is that while demonstrated actions may be suboptimal in the real world, they may actually be near-optimal with respect to the user's internal model of the dynamics. By estimating these internal beliefs from observed behavior, we arrive at a new method for inferring intent. We demonstrate in simulation and in a user study with 12 participants that this approach enables us to more accurately model human intent, and can be used in a variety of applications, including offering assistance in a shared autonomy framework and inferring human preferences.
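As a toy illustration of the idea that suboptimal actions can be near-optimal under a mistaken internal model, the sketch below compares two candidate beliefs about the dynamics of a one-dimensional world and picks the one that better explains the observed actions. It is a deliberate simplification, not the paper's method: the user is assumed to be myopic and Boltzmann-rational, the goal is assumed known, and `action_log_likelihood`, `believed_effect`, and `beta` are hypothetical names.

```python
import math


def action_log_likelihood(trajectory, goal, believed_effect, beta=5.0):
    """Log-likelihood of (state, action) pairs for a Boltzmann-rational user.

    `believed_effect` maps each action (-1 or +1) to the displacement the
    user *believes* it causes. The user favors actions they believe will
    bring them closer to `goal` after one step.
    """
    total = 0.0
    for state, action in trajectory:
        # Utility = negative believed distance to the goal after acting.
        utilities = {a: -abs(goal - (state + believed_effect[a]))
                     for a in (-1, 1)}
        log_norm = math.log(sum(math.exp(beta * u)
                                for u in utilities.values()))
        total += beta * utilities[action] - log_norm
    return total


# Two candidate internal models of the dynamics.
candidates = {
    "correct": {-1: -1, 1: 1},    # user knows -1 moves left
    "inverted": {-1: 1, 1: -1},   # user thinks -1 moves right
}

# Hypothetical demonstration: the goal is at 5, yet the user keeps choosing
# -1 and (under the true dynamics) drifts away from it.
observed = [(0, -1), (-1, -1), (-2, -1)]
goal = 5

scores = {name: action_log_likelihood(observed, goal, effect)
          for name, effect in candidates.items()}
best = max(scores, key=scores.get)
print(best, scores)
```

Here the "inverted" belief explains the demonstrations far better than the "correct" one: the actions look irrational under the true dynamics but are consistent with a user pursuing the goal under a wrong model.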
- North America > Canada (0.14)
- Asia > China > Beijing > Beijing (0.04)
- North America > United States > Minnesota (0.04)
- Questionnaire & Opinion Survey (0.68)
- Research Report > New Finding (0.67)
- Consumer Products & Services (0.46)
- Health & Medicine (0.46)
SAFEWORLD: Geo-Diverse Safety Alignment
Despite significant progress in this area, an essential factor often remains overlooked: geo-diversity. Recognizing and incorporating geographical variations [41, 40, 4, 10, 31, 6] in safety principles is crucial in the global landscape of LLM safety. Cultural norms and legal frameworks vary widely, resulting in diverse definitions of safe and acceptable behavior. As shown in Figure 1, while giving a green hat as a gift might be benign in many cultures, it is considered offensive in China.
- North America > Canada > Ontario > Toronto (0.04)
- North America > Dominican Republic (0.04)
- Asia > Singapore (0.04)
- Research Report (0.67)
- Questionnaire & Opinion Survey (0.67)
- Europe > Switzerland > Zürich > Zürich (0.14)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Research Report (1.00)
- Questionnaire & Opinion Survey (0.68)
- Information Technology > Security & Privacy (0.46)
- Transportation > Ground > Road (0.46)
- Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.40)
- Europe > United Kingdom > England > Greater London > London (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.95)
- Questionnaire & Opinion Survey (0.69)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.47)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.27)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- Europe > Russia (0.14)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Questionnaire & Opinion Survey (1.00)
- Media (1.00)
- Leisure & Entertainment (1.00)
- Information Technology > Security & Privacy (1.00)
- North America > United States > Arizona (0.04)
- Europe > France (0.04)
- North America > United States > Virginia (0.04)
- Questionnaire & Opinion Survey (1.00)
- Personal > Interview (1.00)
- Research Report > New Finding (0.67)
- Overview (0.67)
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine (1.00)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
- North America > United States > California (0.04)
- Research Report > New Finding (0.92)
- Questionnaire & Opinion Survey (0.68)
- Government (0.92)
- Education (0.70)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.98)